We start with an exploratory data analysis and preprocessing.
hirs <- read.table("hirsutism.dat", header = T, sep = "\t", fill = TRUE)
summary(hirs)
## Treatment FGm0 FGm3 FGm6
## Min. :0.000 Min. :14.57 Min. : 4.381 Min. : 1.786
## 1st Qu.:1.000 1st Qu.:16.23 1st Qu.: 9.557 1st Qu.: 7.202
## Median :2.000 Median :17.65 Median :12.643 Median :10.286
## Mean :1.535 Mean :18.57 Mean :13.084 Mean :10.853
## 3rd Qu.:3.000 3rd Qu.:20.17 3rd Qu.:16.219 3rd Qu.:14.204
## Max. :3.000 Max. :28.36 Max. :25.637 Max. :23.411
##
## FGm12 SysPres DiaPres weight
## Min. :-1.163 Min. : 88.0 Min. :46.00 Min. : 41.00
## 1st Qu.: 5.093 1st Qu.:110.0 1st Qu.:65.00 1st Qu.: 57.00
## Median : 7.524 Median :115.0 Median :70.00 Median : 64.00
## Mean : 8.911 Mean :115.9 Mean :70.04 Mean : 68.06
## 3rd Qu.:12.101 3rd Qu.:120.0 3rd Qu.:75.00 3rd Qu.: 74.50
## Max. :22.759 Max. :162.0 Max. :95.00 Max. :113.00
## NA's :8 NA's :8 NA's :8
## height
## Min. :1.480
## 1st Qu.:1.580
## Median :1.610
## Mean :1.613
## 3rd Qu.:1.650
## Max. :1.800
## NA's :8
hirs$Treatment <- as.factor(hirs$Treatment)
summary(hirs)
## Treatment FGm0 FGm3 FGm6 FGm12
## 0:23 Min. :14.57 Min. : 4.381 Min. : 1.786 Min. :-1.163
## 1:26 1st Qu.:16.23 1st Qu.: 9.557 1st Qu.: 7.202 1st Qu.: 5.093
## 2:24 Median :17.65 Median :12.643 Median :10.286 Median : 7.524
## 3:26 Mean :18.57 Mean :13.084 Mean :10.853 Mean : 8.911
## 3rd Qu.:20.17 3rd Qu.:16.219 3rd Qu.:14.204 3rd Qu.:12.101
## Max. :28.36 Max. :25.637 Max. :23.411 Max. :22.759
##
## SysPres DiaPres weight height
## Min. : 88.0 Min. :46.00 Min. : 41.00 Min. :1.480
## 1st Qu.:110.0 1st Qu.:65.00 1st Qu.: 57.00 1st Qu.:1.580
## Median :115.0 Median :70.00 Median : 64.00 Median :1.610
## Mean :115.9 Mean :70.04 Mean : 68.06 Mean :1.613
## 3rd Qu.:120.0 3rd Qu.:75.00 3rd Qu.: 74.50 3rd Qu.:1.650
## Max. :162.0 Max. :95.00 Max. :113.00 Max. :1.800
## NA's :8 NA's :8 NA's :8 NA's :8
attach(hirs)
boxplot(hirs[, 2:5])
par(mfrow = c(2, 2))
boxplot(hirs[, 2] ~ Treatment, ylim = c(0, 30), main = names(hirs)[2], xlab = "Treatment")
boxplot(hirs[, 3] ~ Treatment, ylim = c(0, 30), main = names(hirs)[3], xlab = "Treatment")
boxplot(hirs[, 4] ~ Treatment, ylim = c(0, 30), main = names(hirs)[4], xlab = "Treatment")
boxplot(hirs[, 5] ~ Treatment, ylim = c(0, 30), main = names(hirs)[5], xlab = "Treatment")
par(mfrow = c(1, 1))
par(mfrow = c(2, 2))
boxplot(hirs[Treatment == 0, 2:5], ylim = c(0, 30), main = "Treatment 0")
boxplot(hirs[Treatment == 1, 2:5], ylim = c(0, 30), main = "Treatment 1")
boxplot(hirs[Treatment == 2, 2:5], ylim = c(0, 30), main = "Treatment 2")
boxplot(hirs[Treatment == 3, 2:5], ylim = c(0, 30), main = "Treatment 3")
par(mfrow = c(1, 1))
# Remove observations with missing data
hirs <- na.omit(hirs)
# Pacient 84 has FGm12 < 0, so we set it 0
hirs[hirs$FGm12 < 0, "FGm12"] <- 0
The boxplots show that all 4 treatments reduce the hirsutism level as months go by. Treatment 0 (only contraceptive) is the less effective one, while the other three are equally powerful on average, but with different distributions of levels.
We start the modelling of FGm12 by fitting a full linear model. After finding a good lm, we will build a semiparametric GAM based on it. Note that we cannot use variables FGm3 nor FGm6 as predictors.
gam1 <- gam(
FGm12 ~ Treatment + FGm0 + SysPres + DiaPres
+ weight + height,
data = hirs
)
summary(gam1)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + FGm0 + SysPres + DiaPres + weight + height
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.85710 14.79649 1.274 0.206111
## Treatment1 -4.32668 1.47552 -2.932 0.004360 **
## Treatment2 -4.29888 1.49026 -2.885 0.005005 **
## Treatment3 -3.87512 1.43820 -2.694 0.008551 **
## FGm0 0.59847 0.16798 3.563 0.000615 ***
## SysPres -0.07357 0.05174 -1.422 0.158889
## DiaPres 0.03294 0.07088 0.465 0.643340
## weight 0.02578 0.04409 0.585 0.560279
## height -8.27818 9.05146 -0.915 0.363100
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## R-sq.(adj) = 0.169 Deviance explained = 24.3%
## GCV = 24.95 Scale est. = 22.482 n = 91
We shall remove non-significant variables.
gam4 <- gam(FGm12 ~ Treatment + FGm0,
data = hirs
)
summary(gam4)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + FGm0
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.6111 3.0551 0.200 0.841925
## Treatment1 -4.5812 1.4391 -3.183 0.002027 **
## Treatment2 -4.4290 1.4430 -3.069 0.002870 **
## Treatment3 -3.5497 1.3820 -2.568 0.011942 *
## FGm0 0.6217 0.1624 3.829 0.000244 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## R-sq.(adj) = 0.179 Deviance explained = 21.6%
## GCV = 23.5 Scale est. = 22.208 n = 91
The model is pretty similar in terms of R-sq.(adj) and Deviance explained. The effect of FGm0 might vary with the treatment, so let us add interactions.
gam5 <- gam(FGm12 ~ FGm0 * Treatment,
data = hirs
)
summary(gam5)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ FGm0 * Treatment
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.479373 8.236653 0.665 0.508
## FGm0 0.347752 0.460024 0.756 0.452
## Treatment1 -8.335161 9.930825 -0.839 0.404
## Treatment2 -4.164049 10.917542 -0.381 0.704
## Treatment3 -14.046703 9.737356 -1.443 0.153
## FGm0:Treatment1 0.215853 0.540888 0.399 0.691
## FGm0:Treatment2 0.008244 0.588629 0.014 0.989
## FGm0:Treatment3 0.579027 0.536859 1.079 0.284
##
##
## R-sq.(adj) = 0.171 Deviance explained = 23.5%
## GCV = 24.596 Scale est. = 22.434 n = 91
R-sq.(adj) has actually decreased when adding the interaction between FGm0 and Treatment.
We will now compare gam4 and gam5 against all other models with anova. After finding the best model so far, we will smooth more predictive variables.
anova(gam4, gam5, gam1, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ Treatment + FGm0
## Model 2: FGm12 ~ FGm0 * Treatment
## Model 3: FGm12 ~ Treatment + FGm0 + SysPres + DiaPres + weight + height
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 86 1909.9
## 2 83 1862.0 3 47.896 0.7101 0.5487
## 3 82 1843.5 1 18.485 0.8222 0.3672
anova(gam5, gam4, gam1, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ FGm0 * Treatment
## Model 2: FGm12 ~ Treatment + FGm0
## Model 3: FGm12 ~ Treatment + FGm0 + SysPres + DiaPres + weight + height
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 83 1862.0
## 2 86 1909.9 -3 -47.896 0.7101 0.5487
## 3 82 1843.5 4 66.381 0.7381 0.5686
We cannot reject the hypothesis that adding interactions does not improve the model.
We decide that the best model is gam4 because it is simpler than gam5.
vis.gam(gam4, plot.type = "persp", theta = 30, phi = 30, type = "response")
vis.gam(gam4, plot.type = "contour", type = "response", main = "FGm12 predicted with gam4")
Let us analyze the residuals of gam4.
gam.check(gam4)
##
## Method: GCV Optimizer: magic
## Model required no smoothing parameter selectionModel rank = 5 / 5
Residuals seem to be homoscedastic and centered around zero but they seem to be more left tailed than a normal distribution. Moreover the plot of the fitted vs real values shows that the model is not very precise in fitting the data, this could be due to underfitting. Therefore, our linear model seems to be to simple and suffers in catching the structure of the data.
From here, we can add smooth terms to obtain a semiparametric GAM.
We can first try to smooth FGm0.
gam4.1 <- gam(FGm12 ~ Treatment + s(FGm0),
data = hirs
)
summary(gam4.1)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.3682 0.9761 12.671 < 2e-16 ***
## Treatment1 -5.0772 1.3920 -3.648 0.000466 ***
## Treatment2 -4.5785 1.3902 -3.293 0.001467 **
## Treatment3 -3.5242 1.3417 -2.627 0.010305 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 5.74 6.869 4.012 0.00102 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.259 Deviance explained = 33.1%
## GCV = 22.447 Scale est. = 20.044 n = 91
R-sq.(adj) and Deviance explained increase significantly.
Now we will try to add interactions.
gam4.2 <- gam(FGm12 ~ s(FGm0, by = Treatment),
data = hirs
)
summary(gam4.2)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ s(FGm0, by = Treatment)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.248 0.542 15.22 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0):Treatment0 5.529 6.263 2.517 0.0256 *
## s(FGm0):Treatment1 1.849 2.329 2.579 0.0796 .
## s(FGm0):Treatment2 1.000 1.000 0.991 0.3225
## s(FGm0):Treatment3 3.957 4.867 4.157 0.0026 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.288 Deviance explained = 38.5%
## GCV = 22.58 Scale est. = 19.271 n = 91
R-sq.(adj) and explained Deviance improve slightly even if terms become less significant. Let us compare both models with anova.
anova(gam4, gam4.1, gam4.2, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ Treatment + FGm0
## Model 2: FGm12 ~ Treatment + s(FGm0)
## Model 3: FGm12 ~ s(FGm0, by = Treatment)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 86.000 1909.9
## 2 80.131 1628.8 5.8685 281.13 2.4858 0.03104 *
## 3 75.540 1496.7 4.5911 132.09 1.4929 0.20649
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(gam4.1, gam4.2, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ Treatment + s(FGm0)
## Model 2: FGm12 ~ s(FGm0, by = Treatment)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 80.131 1628.8
## 2 75.540 1496.7 4.5911 132.09 1.4929 0.2065
We cannot reject the hypothesis that adding interactions does not improve the model. However, we will keep them in the following models because intuitively they should be useful. Later, with more terms, we will check again if interactions should be added.
Now we will introduce previously removed variables one by one.
gam4.4 <- gam(FGm12 ~ Treatment + s(FGm0) + s(SysPres),
data = hirs
)
summary(gam4.4)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(SysPres)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.2573 0.9757 12.562 < 2e-16 ***
## Treatment1 -4.9105 1.3993 -3.509 0.000743 ***
## Treatment2 -4.3015 1.4037 -3.064 0.002977 **
## Treatment3 -3.5110 1.3316 -2.637 0.010067 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 5.811 6.936 3.891 0.00121 **
## s(SysPres) 1.711 2.164 1.097 0.37270
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.272 Deviance explained = 35.7%
## GCV = 22.542 Scale est. = 19.688 n = 91
s(SysPres) is not significant.
gam4.5 <- gam(FGm12 ~ Treatment + s(FGm0) + s(DiaPres),
data = hirs
)
summary(gam4.5)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(DiaPres)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.3478 0.9831 12.560 < 2e-16 ***
## Treatment1 -5.0322 1.4090 -3.571 0.000609 ***
## Treatment2 -4.5344 1.4112 -3.213 0.001903 **
## Treatment3 -3.5283 1.3422 -2.629 0.010299 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 6.153 7.279 3.804 0.00105 **
## s(DiaPres) 2.115 2.661 0.990 0.52962
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.275 Deviance explained = 36.6%
## GCV = 22.676 Scale est. = 19.619 n = 91
s(DiaPres) is not significant.
gam4.6 <- gam(FGm12 ~ Treatment + s(FGm0) + s(weight),
data = hirs
)
summary(gam4.6)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(weight)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.2937 0.9897 12.422 < 2e-16 ***
## Treatment1 -4.6889 1.4163 -3.311 0.00142 **
## Treatment2 -4.5484 1.4201 -3.203 0.00198 **
## Treatment3 -3.6212 1.3583 -2.666 0.00934 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 5.211 6.279 3.929 0.00143 **
## s(weight) 4.389 5.395 0.835 0.56494
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.268 Deviance explained = 37.1%
## GCV = 23.273 Scale est. = 19.795 n = 91
s(weight) is not significant.
gam4.7 <- gam(FGm12 ~ Treatment + s(FGm0) + s(height),
data = hirs
)
summary(gam4.7)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(height)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.4056 0.9762 12.708 < 2e-16 ***
## Treatment1 -5.0209 1.3932 -3.604 0.000543 ***
## Treatment2 -4.5568 1.3905 -3.277 0.001552 **
## Treatment3 -3.7291 1.3646 -2.733 0.007727 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 5.930 7.061 4.046 0.000787 ***
## s(height) 1.001 1.002 0.853 0.358823
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.261 Deviance explained = 34.2%
## GCV = 22.726 Scale est. = 19.996 n = 91
s(height) is not significant.
We will now try tensor splines one by one.
gam4.8 <- gam(FGm12 ~ Treatment + s(FGm0) + te(SysPres, DiaPres),
data = hirs
)
summary(gam4.8)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + te(SysPres, DiaPres)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.3083 0.9651 12.754 < 2e-16 ***
## Treatment1 -5.0462 1.3884 -3.635 0.000502 ***
## Treatment2 -4.2829 1.3914 -3.078 0.002889 **
## Treatment3 -3.5935 1.3168 -2.729 0.007876 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 6.346 7.461 3.814 0.00136 **
## te(SysPres,DiaPres) 3.996 4.635 1.480 0.19701
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.304 Deviance explained = 40.7%
## GCV = 22.366 Scale est. = 18.841 n = 91
Still, not significant. We will try with te(weight, height).
gam4.9 <- gam(FGm12 ~ Treatment + s(FGm0) + te(weight, height),
data = hirs
)
summary(gam4.9)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + te(weight, height)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.6476 0.9869 12.816 < 2e-16 ***
## Treatment1 -5.2275 1.3936 -3.751 0.000338 ***
## Treatment2 -4.8505 1.4085 -3.444 0.000929 ***
## Treatment3 -4.1696 1.3981 -2.982 0.003825 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 6.238 7.339 4.283 0.000453 ***
## te(weight,height) 3.362 3.659 0.914 0.460328
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.275 Deviance explained = 37.7%
## GCV = 23.05 Scale est. = 19.605 n = 91
Not significant.
Now we will add interactions, one by one.
gam4.4.int <- gam(FGm12 ~ Treatment + s(FGm0) + s(SysPres, by = Treatment),
data = hirs
)
summary(gam4.4.int)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(SysPres, by = Treatment)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.113 1.026 11.801 < 2e-16 ***
## Treatment1 -4.795 1.451 -3.304 0.00146 **
## Treatment2 -4.337 1.449 -2.993 0.00372 **
## Treatment3 -3.402 1.399 -2.432 0.01739 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 5.911 7.037 3.550 0.00222 **
## s(SysPres):Treatment0 1.492 1.832 0.860 0.33552
## s(SysPres):Treatment1 1.000 1.000 0.058 0.81104
## s(SysPres):Treatment2 1.000 1.001 0.012 0.91518
## s(SysPres):Treatment3 1.558 1.934 0.640 0.46465
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.262 Deviance explained = 37.6%
## GCV = 23.894 Scale est. = 19.966 n = 91
s(SysPres, by=Treatment) is not significant.
gam4.5.int <- gam(FGm12 ~ Treatment + s(FGm0) + s(DiaPres, by = Treatment),
data = hirs
)
summary(gam4.5.int)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(DiaPres, by = Treatment)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.147 1.042 11.660 < 2e-16 ***
## Treatment1 -4.849 1.459 -3.323 0.00138 **
## Treatment2 -4.384 1.454 -3.015 0.00350 **
## Treatment3 -3.384 1.391 -2.433 0.01734 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 6.051 7.180 3.048 0.00633 **
## s(DiaPres):Treatment0 1.000 1.000 0.146 0.70304
## s(DiaPres):Treatment1 1.000 1.000 0.001 0.97320
## s(DiaPres):Treatment2 1.000 1.000 0.001 0.97783
## s(DiaPres):Treatment3 2.429 3.012 1.479 0.22289
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.275 Deviance explained = 39.1%
## GCV = 23.641 Scale est. = 19.619 n = 91
s(DiaPres, by=Treatment) is not significant.
gam4.6.int <- gam(FGm12 ~ Treatment + s(FGm0) + s(weight, by = Treatment),
data = hirs
)
summary(gam4.6.int)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(weight, by = Treatment)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.111 1.046 11.573 < 2e-16 ***
## Treatment1 -4.927 1.504 -3.277 0.00159 **
## Treatment2 -4.378 1.469 -2.980 0.00388 **
## Treatment3 -3.322 1.394 -2.384 0.01964 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 5.337 6.426 3.996 0.00151 **
## s(weight):Treatment0 1.899 2.348 0.405 0.60837
## s(weight):Treatment1 1.431 1.736 0.397 0.53697
## s(weight):Treatment2 1.000 1.000 0.158 0.69169
## s(weight):Treatment3 1.503 1.846 0.375 0.64459
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.258 Deviance explained = 37.5%
## GCV = 24.074 Scale est. = 20.061 n = 91
s(weight, by=Treatment) is not significant.
gam4.7.int <- gam(FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment),
data = hirs
)
summary(gam4.7.int)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.738 1.068 11.928 < 2e-16 ***
## Treatment1 -5.461 1.436 -3.803 0.000299 ***
## Treatment2 -4.782 1.396 -3.426 0.001024 **
## Treatment3 -4.244 1.422 -2.983 0.003907 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 6.071 7.108 3.666 0.00171 **
## s(height):Treatment0 3.766 4.486 2.114 0.08190 .
## s(height):Treatment1 2.038 2.532 1.816 0.13289
## s(height):Treatment2 2.323 2.836 1.756 0.22275
## s(height):Treatment3 1.824 2.267 1.492 0.29097
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.398 Deviance explained = 52.5%
## GCV = 20.884 Scale est. = 16.289 n = 91
s(height, by=Treatment) is not significant.
We will now try tensor splines one by one.
gam4.8.int <- gam(FGm12 ~ Treatment + s(FGm0) + te(SysPres, DiaPres, by = Treatment),
data = hirs
)
summary(gam4.8.int)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + te(SysPres, DiaPres, by = Treatment)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.940 1.160 11.154 < 2e-16 ***
## Treatment1 -5.723 1.545 -3.704 0.000431 ***
## Treatment2 -5.939 1.582 -3.755 0.000365 ***
## Treatment3 -4.218 1.470 -2.870 0.005484 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 6.296 7.401 2.808 0.0101 *
## te(SysPres,DiaPres):Treatment0 3.000 3.000 1.643 0.1876
## te(SysPres,DiaPres):Treatment1 3.792 4.334 0.417 0.7863
## te(SysPres,DiaPres):Treatment2 3.875 4.395 0.690 0.6552
## te(SysPres,DiaPres):Treatment3 3.000 3.000 3.282 0.0261 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.332 Deviance explained = 50.2%
## GCV = 24.545 Scale est. = 18.082 n = 91
Still, not significant. We will try with te(weight, height).
gam4.9.int <- gam(FGm12 ~ Treatment + s(FGm0) + te(weight, height, by = Treatment),
data = hirs
)
summary(gam4.9.int)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + te(weight, height, by = Treatment)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.920 1.460 8.848 7.44e-13 ***
## Treatment1 -4.852 1.762 -2.754 0.00758 **
## Treatment2 -5.136 1.727 -2.973 0.00410 **
## Treatment3 -4.262 1.717 -2.483 0.01555 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 1.000 1.000 13.202 0.000543 ***
## te(weight,height):Treatment0 6.212 6.972 1.777 0.105300
## te(weight,height):Treatment1 3.000 3.000 2.451 0.070988 .
## te(weight,height):Treatment2 3.746 4.271 0.637 0.641862
## te(weight,height):Treatment3 6.351 7.128 2.508 0.023349 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.378 Deviance explained = 53.9%
## GCV = 22.967 Scale est. = 16.832 n = 91
Overall, it looks like the best model is gam4.2. Let us compare it will all other models with anova.
anova(gam4.2, gam4.4, gam4.5, gam4.6, gam4.7, gam4.8, gam4.9,
gam4.4.int, gam4.5.int, gam4.6.int, gam4.7.int, gam4.8.int, gam4.9.int,
test = "F"
)
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ s(FGm0, by = Treatment)
## Model 2: FGm12 ~ Treatment + s(FGm0) + s(SysPres)
## Model 3: FGm12 ~ Treatment + s(FGm0) + s(DiaPres)
## Model 4: FGm12 ~ Treatment + s(FGm0) + s(weight)
## Model 5: FGm12 ~ Treatment + s(FGm0) + s(height)
## Model 6: FGm12 ~ Treatment + s(FGm0) + te(SysPres, DiaPres)
## Model 7: FGm12 ~ Treatment + s(FGm0) + te(weight, height)
## Model 8: FGm12 ~ Treatment + s(FGm0) + s(SysPres, by = Treatment)
## Model 9: FGm12 ~ Treatment + s(FGm0) + s(DiaPres, by = Treatment)
## Model 10: FGm12 ~ Treatment + s(FGm0) + s(weight, by = Treatment)
## Model 11: FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment)
## Model 12: FGm12 ~ Treatment + s(FGm0) + te(SysPres, DiaPres, by = Treatment)
## Model 13: FGm12 ~ Treatment + s(FGm0) + te(weight, height, by = Treatment)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 75.540 1496.7
## 2 77.900 1564.8 -2.35996 -68.05 1.7133 0.183041
## 3 77.060 1544.6 0.84066 20.15 1.4238 0.230910
## 4 75.326 1532.1 1.73348 12.50 0.4286 0.624649
## 5 78.937 1601.1 -3.61126 -68.98 1.1348 0.346110
## 6 74.905 1444.3 4.03264 156.79 2.3100 0.066696 .
## 7 76.002 1517.5 -1.09720 -73.16 3.9616 0.047165 *
## 8 74.196 1518.2 1.80618 -0.72
## 9 73.808 1481.6 0.38761 36.56 5.6042 0.046457 *
## 10 73.644 1521.2 0.16425 -39.61
## 11 67.770 1156.2 5.87409 365.07 3.6924 0.003438 **
## 12 64.870 1212.2 2.89951 -56.03
## 13 64.628 1122.5 0.24192 89.65 22.0174 0.003567 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We remove the worst models.
anova(gam4.1, gam4.8, gam4.9,
gam4.5.int, gam4.7.int, gam4.9.int,
test = "F"
)
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ Treatment + s(FGm0)
## Model 2: FGm12 ~ Treatment + s(FGm0) + te(SysPres, DiaPres)
## Model 3: FGm12 ~ Treatment + s(FGm0) + te(weight, height)
## Model 4: FGm12 ~ Treatment + s(FGm0) + s(DiaPres, by = Treatment)
## Model 5: FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment)
## Model 6: FGm12 ~ Treatment + s(FGm0) + te(weight, height, by = Treatment)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 80.131 1628.8
## 2 74.905 1444.3 5.2267 184.50 2.0972 0.07419 .
## 3 76.002 1517.5 -1.0972 -73.16 3.9616 0.04716 *
## 4 73.808 1481.6 2.1938 35.84 0.9707 0.39093
## 5 67.770 1156.2 6.0383 325.47 3.2023 0.00804 **
## 6 64.628 1122.5 3.1414 33.63 0.6360 0.60161
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We do the same again.
anova(gam4.1, gam4.8, gam4.9,
gam4.7.int,
test = "F"
)
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ Treatment + s(FGm0)
## Model 2: FGm12 ~ Treatment + s(FGm0) + te(SysPres, DiaPres)
## Model 3: FGm12 ~ Treatment + s(FGm0) + te(weight, height)
## Model 4: FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 80.131 1628.8
## 2 74.905 1444.3 5.2267 184.50 2.1671 0.06521 .
## 3 76.002 1517.5 -1.0972 -73.16 4.0936 0.04342 *
## 4 67.770 1156.2 8.2321 361.31 2.6945 0.01164 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It seems that adding s(height, by = Treatment) is useful. We can also try to add te(SysPres, DiaPres, by = Treatment).
gam4.10 <- gam(
FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment)
+ te(SysPres, DiaPres, by = Treatment),
data = hirs
)
summary(gam4.10)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment) + te(SysPres,
## DiaPres, by = Treatment)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.065 11.382 -0.445 0.658
## Treatment1 12.352 11.419 1.082 0.285
## Treatment2 12.163 11.407 1.066 0.291
## Treatment3 13.484 11.403 1.182 0.243
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 5.957 6.967 3.794 0.00217 **
## s(height):Treatment0 5.401 5.917 1.317 0.26609
## s(height):Treatment1 2.351 2.834 2.381 0.05916 .
## s(height):Treatment2 1.960 2.354 1.843 0.21313
## s(height):Treatment3 2.408 2.982 1.796 0.13219
## te(SysPres,DiaPres):Treatment0 7.263 7.601 1.270 0.32081
## te(SysPres,DiaPres):Treatment1 4.312 5.000 0.622 0.68455
## te(SysPres,DiaPres):Treatment2 3.904 4.380 1.120 0.40402
## te(SysPres,DiaPres):Treatment3 3.000 3.000 4.306 0.00888 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.541 Deviance explained = 74.3%
## GCV = 22.4 Scale est. = 12.417 n = 91
R-sq.(adj) and Deviance explained have increased a lot. Let us check the residuals of gam4.10.
gam.check(gam4.10)
##
## Method: GCV Optimizer: magic
## Smoothing parameter selection converged after 27 iterations.
## The RMS GCV score gradient at convergence was 4.840213e-07 .
## The Hessian was positive definite.
## Model rank = 145 / 145
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(FGm0) 9.00 5.96 1.16 0.94
## s(height):Treatment0 9.00 5.40 0.94 0.24
## s(height):Treatment1 9.00 2.35 0.94 0.28
## s(height):Treatment2 9.00 1.96 0.94 0.27
## s(height):Treatment3 9.00 2.41 0.94 0.18
## te(SysPres,DiaPres):Treatment0 24.00 7.26 1.12 0.89
## te(SysPres,DiaPres):Treatment1 24.00 4.31 1.04 0.61
## te(SysPres,DiaPres):Treatment2 24.00 3.90 1.15 0.91
## te(SysPres,DiaPres):Treatment3 24.00 3.00 1.12 0.78
The tensor spline has 25 knots, which are way more than the edf of its variables. Hence, we reduce them to 4^2 = 16.
gam4.11 <- gam(
FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment)
+ te(SysPres, DiaPres, by = Treatment, k = 4),
data = hirs
)
summary(gam4.11)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment) + te(SysPres,
## DiaPres, by = Treatment, k = 4)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.858 1.245 10.331 9.57e-15 ***
## Treatment1 -5.564 1.580 -3.521 0.000847 ***
## Treatment2 -5.492 1.615 -3.401 0.001225 **
## Treatment3 -4.703 1.562 -3.011 0.003858 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0) 5.979 6.990 3.061 0.00787 **
## s(height):Treatment0 3.604 4.319 1.448 0.19943
## s(height):Treatment1 1.929 2.357 1.637 0.16382
## s(height):Treatment2 1.902 2.296 1.604 0.25767
## s(height):Treatment3 1.610 1.989 1.200 0.28269
## te(SysPres,DiaPres):Treatment0 3.040 3.077 0.517 0.69287
## te(SysPres,DiaPres):Treatment1 3.680 4.138 0.429 0.79633
## te(SysPres,DiaPres):Treatment2 3.751 4.156 0.795 0.51798
## te(SysPres,DiaPres):Treatment3 3.720 4.201 2.313 0.05767 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.442 Deviance explained = 64.1%
## GCV = 23.791 Scale est. = 15.107 n = 91
gam.check(gam4.11)
##
## Method: GCV Optimizer: magic
## Smoothing parameter selection converged after 27 iterations.
## The RMS GCV score gradient at convergence was 3.757111e-07 .
## The Hessian was positive definite.
## Model rank = 109 / 109
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(FGm0) 9.00 5.98 1.10 0.82
## s(height):Treatment0 9.00 3.60 0.90 0.19
## s(height):Treatment1 9.00 1.93 0.90 0.14
## s(height):Treatment2 9.00 1.90 0.90 0.16
## s(height):Treatment3 9.00 1.61 0.90 0.18
## te(SysPres,DiaPres):Treatment0 15.00 3.04 1.20 0.95
## te(SysPres,DiaPres):Treatment1 15.00 3.68 1.19 0.97
## te(SysPres,DiaPres):Treatment2 15.00 3.75 1.19 0.96
## te(SysPres,DiaPres):Treatment3 15.00 3.72 1.20 0.96
anova(gam4.10, gam4.11, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment) + te(SysPres,
## DiaPres, by = Treatment)
## Model 2: FGm12 ~ Treatment + s(FGm0) + s(height, by = Treatment) + te(SysPres,
## DiaPres, by = Treatment, k = 4)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 45.965 626.38
## 2 53.478 872.98 -7.5131 -246.6 2.6434 0.01978 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Even though R-sq.(adj) and explained Deviance have decreased a little, we can reject that decreasing the number of knots of te(weight, height, by=Treatment) does not improve the model.
Maybe we need more interactions. We will start with s(FGm0).
gam4.12 <- gam(
FGm12 ~ Treatment + s(FGm0, by = Treatment) + s(height, by = Treatment)
+ te(SysPres, DiaPres, by = Treatment, k = 4),
data = hirs
)
summary(gam4.12)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + s(FGm0, by = Treatment) + s(height, by = Treatment) +
## te(SysPres, DiaPres, by = Treatment, k = 4)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.230 1.207 10.132 5.34e-14 ***
## Treatment1 -5.260 1.525 -3.448 0.00111 **
## Treatment2 -5.400 1.598 -3.378 0.00137 **
## Treatment3 -3.972 1.592 -2.494 0.01578 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0):Treatment0 1.000 1.000 0.002 0.96680
## s(FGm0):Treatment1 4.647 5.531 2.475 0.03546 *
## s(FGm0):Treatment2 1.000 1.000 0.655 0.42186
## s(FGm0):Treatment3 4.053 4.892 4.800 0.00121 **
## s(height):Treatment0 3.786 4.519 1.132 0.27648
## s(height):Treatment1 1.378 1.627 0.590 0.39446
## s(height):Treatment2 1.463 1.772 0.474 0.65956
## s(height):Treatment3 1.000 1.000 3.404 0.07061 .
## te(SysPres,DiaPres):Treatment0 3.000 3.000 0.610 0.61149
## te(SysPres,DiaPres):Treatment1 3.595 3.995 0.596 0.67454
## te(SysPres,DiaPres):Treatment2 4.509 5.048 1.301 0.27995
## te(SysPres,DiaPres):Treatment3 4.597 5.238 3.700 0.00749 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.496 Deviance explained = 70.4%
## GCV = 23.406 Scale est. = 13.625 n = 91
R-sq.(adj) and explained Deviance have increased again. weight is the only baseline predictor we are not using; let us add it again, but now interacting with Treatment.
gam4.13 <- gam(
FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment)
+ s(weight, by = Treatment)
+ te(SysPres, DiaPres, by = Treatment, k = 4),
data = hirs
)
summary(gam4.13)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## s(weight, by = Treatment) + te(SysPres, DiaPres, by = Treatment,
## k = 4)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.8015 0.3022 25.82 6.63e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0):Treatment0 6.398 6.516 19.844 8.05e-06 ***
## s(FGm0):Treatment1 8.743 8.912 22.964 1.54e-06 ***
## s(FGm0):Treatment2 1.000 1.000 1.634 0.221950
## s(FGm0):Treatment3 6.457 7.131 44.199 < 2e-16 ***
## s(height):Treatment0 7.065 7.206 25.957 1.41e-06 ***
## s(height):Treatment1 2.322 2.750 4.637 0.021204 *
## s(height):Treatment2 3.364 3.850 2.807 0.067149 .
## s(height):Treatment3 1.000 1.000 28.625 0.000105 ***
## s(weight):Treatment0 1.000 1.000 14.590 0.001880 **
## s(weight):Treatment1 1.859 2.199 6.582 0.008554 **
## s(weight):Treatment2 7.926 8.409 21.405 7.64e-06 ***
## s(weight):Treatment3 4.501 5.192 3.004 0.038868 *
## te(SysPres,DiaPres):Treatment0 6.177 6.279 15.830 1.77e-05 ***
## te(SysPres,DiaPres):Treatment1 4.438 4.994 5.180 0.006575 **
## te(SysPres,DiaPres):Treatment2 6.411 6.703 27.106 8.59e-06 ***
## te(SysPres,DiaPres):Treatment3 7.806 8.638 24.197 4.62e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.965 Deviance explained = 99.5%
## GCV = 6.2762 Scale est. = 0.9334 n = 91
This seems to be a good model. Let us check that the number of knots is a proper one.
gam.check(gam4.13)
##
## Method: GCV Optimizer: magic
## Smoothing parameter selection converged after 258 iterations.
## The RMS GCV score gradient at convergence was 0.0008328835 .
## The Hessian was not positive definite.
## Model rank = 169 / 169
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(FGm0):Treatment0 9.00 6.40 1.26 0.99
## s(FGm0):Treatment1 9.00 8.74 1.26 0.99
## s(FGm0):Treatment2 9.00 1.00 1.26 0.98
## s(FGm0):Treatment3 9.00 6.46 1.26 0.99
## s(height):Treatment0 9.00 7.06 0.98 0.41
## s(height):Treatment1 9.00 2.32 0.98 0.38
## s(height):Treatment2 9.00 3.36 0.98 0.38
## s(height):Treatment3 9.00 1.00 0.98 0.39
## s(weight):Treatment0 9.00 1.00 1.12 0.80
## s(weight):Treatment1 9.00 1.86 1.12 0.86
## s(weight):Treatment2 9.00 7.93 1.12 0.80
## s(weight):Treatment3 9.00 4.50 1.12 0.86
## te(SysPres,DiaPres):Treatment0 15.00 6.18 1.09 0.78
## te(SysPres,DiaPres):Treatment1 15.00 4.44 1.16 0.91
## te(SysPres,DiaPres):Treatment2 15.00 6.41 1.17 0.93
## te(SysPres,DiaPres):Treatment3 15.00 7.81 1.18 0.95
Gam.13 shows nice fitted values compared to real ones, however residuals are not normally distributed and seem to have heavier tails, highlighting that in some cases the model deviates significantly from the real values.
The number of knots of s(FGm0, by=Treatment) should be higher, but fitting gam4.13 is so slow that adding more knots causes the algorithm to crash.
For some treatments, FGm0, height and weight seem to be linear. Let us make them linear terms one by one.
gam4.13.1 <- gam(
FGm12 ~ FGm0 * Treatment + s(height, by = Treatment)
+ s(weight, by = Treatment)
+ te(SysPres, DiaPres, by = Treatment, k = 4),
data = hirs
)
summary(gam4.13.1)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ FGm0 * Treatment + s(height, by = Treatment) + s(weight,
## by = Treatment) + te(SysPres, DiaPres, by = Treatment, k = 4)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.3103 9.2059 0.903 0.3709
## FGm0 0.2653 0.4979 0.533 0.5965
## Treatment1 -11.2548 10.8179 -1.040 0.3031
## Treatment2 -10.3307 12.0184 -0.860 0.3940
## Treatment3 -28.3519 11.8319 -2.396 0.0203 *
## FGm0:Treatment1 0.3148 0.5686 0.554 0.5822
## FGm0:Treatment2 0.2195 0.6516 0.337 0.7376
## FGm0:Treatment3 1.1934 0.6318 1.889 0.0646 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(height):Treatment0 5.293 5.925 1.921 0.09778 .
## s(height):Treatment1 2.141 2.628 2.033 0.12375
## s(height):Treatment2 2.094 2.584 3.192 0.04227 *
## s(height):Treatment3 1.555 1.904 1.654 0.14461
## s(weight):Treatment0 1.000 1.000 3.340 0.07348 .
## s(weight):Treatment1 1.000 1.000 0.066 0.79837
## s(weight):Treatment2 1.000 1.000 10.915 0.00175 **
## s(weight):Treatment3 1.769 2.168 0.787 0.41892
## te(SysPres,DiaPres):Treatment0 3.000 3.000 0.520 0.67053
## te(SysPres,DiaPres):Treatment1 3.788 4.298 0.679 0.68169
## te(SysPres,DiaPres):Treatment2 4.125 4.719 3.125 0.01532 *
## te(SysPres,DiaPres):Treatment3 5.199 5.888 3.937 0.00239 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.553 Deviance explained = 74.6%
## GCV = 21.572 Scale est. = 12.098 n = 91
anova(gam4.13, gam4.13.1, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## s(weight, by = Treatment) + te(SysPres, DiaPres, by = Treatment,
## k = 4)
## Model 2: FGm12 ~ FGm0 * Treatment + s(height, by = Treatment) + s(weight,
## by = Treatment) + te(SysPres, DiaPres, by = Treatment, k = 4)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 8.221 12.63
## 2 46.887 617.44 -38.666 -604.8 16.758 0.0001177 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
According to anova, it is better to treat FGm0 linearly, but if we do so, then predictors are less significant and R-sq.(adj) and explained Deviance decrease a lot.
Let us try with height.
gam4.13.2 <- gam(
FGm12 ~ s(FGm0, by = Treatment) + height * Treatment
+ s(weight, by = Treatment)
+ te(SysPres, DiaPres, by = Treatment, k = 4),
data = hirs
)
summary(gam4.13.2)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ s(FGm0, by = Treatment) + height * Treatment + s(weight,
## by = Treatment) + te(SysPres, DiaPres, by = Treatment, k = 4)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 32.716 37.281 0.878 0.384
## height -12.821 22.992 -0.558 0.579
## Treatment1 -33.897 48.194 -0.703 0.485
## Treatment2 17.858 53.502 0.334 0.740
## Treatment3 9.314 42.763 0.218 0.828
## height:Treatment1 18.413 29.630 0.621 0.537
## height:Treatment2 -14.813 32.925 -0.450 0.655
## height:Treatment3 -7.718 26.490 -0.291 0.772
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0):Treatment0 1.000 1.000 0.319 0.574477
## s(FGm0):Treatment1 4.760 5.682 2.715 0.024192 *
## s(FGm0):Treatment2 1.818 2.284 0.543 0.578144
## s(FGm0):Treatment3 4.206 5.058 5.375 0.000461 ***
## s(weight):Treatment0 1.570 1.915 0.932 0.467985
## s(weight):Treatment1 1.000 1.000 3.271 0.076304 .
## s(weight):Treatment2 1.000 1.000 8.059 0.006445 **
## s(weight):Treatment3 1.394 1.662 1.300 0.183937
## te(SysPres,DiaPres):Treatment0 3.368 3.689 1.843 0.177343
## te(SysPres,DiaPres):Treatment1 3.000 3.000 1.265 0.296186
## te(SysPres,DiaPres):Treatment2 4.706 5.245 2.919 0.020408 *
## te(SysPres,DiaPres):Treatment3 3.000 3.000 5.976 0.001404 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.517 Deviance explained = 72%
## GCV = 22.8 Scale est. = 13.073 n = 91
anova(gam4.13, gam4.13.2, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## s(weight, by = Treatment) + te(SysPres, DiaPres, by = Treatment,
## k = 4)
## Model 2: FGm12 ~ s(FGm0, by = Treatment) + height * Treatment + s(weight,
## by = Treatment) + te(SysPres, DiaPres, by = Treatment, k = 4)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 8.221 12.63
## 2 48.464 682.15 -40.243 -669.52 17.824 9.17e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We obtain analogous results. Let us finally try with weight.
gam4.13.3 <- gam(
FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment)
+ weight * Treatment
+ te(SysPres, DiaPres, by = Treatment, k = 4),
data = hirs
)
summary(gam4.13.3)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## weight * Treatment + te(SysPres, DiaPres, by = Treatment,
## k = 4)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 25.52851 33.81411 0.755 0.4557
## weight 0.17292 0.09130 1.894 0.0671 .
## Treatment1 -34.96403 34.31610 -1.019 0.3157
## Treatment2 -37.41467 33.98955 -1.101 0.2790
## Treatment3 -13.58996 34.02578 -0.399 0.6922
## weight:Treatment1 0.07347 0.12873 0.571 0.5720
## weight:Treatment2 0.10460 0.10378 1.008 0.3209
## weight:Treatment3 -0.22138 0.10450 -2.118 0.0418 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0):Treatment0 5.223 5.412 5.468 0.000824 ***
## s(FGm0):Treatment1 7.384 8.078 6.330 5.38e-05 ***
## s(FGm0):Treatment2 1.018 1.031 4.188 0.045268 *
## s(FGm0):Treatment3 5.156 6.094 10.851 1.43e-06 ***
## s(height):Treatment0 6.904 7.139 5.438 0.000196 ***
## s(height):Treatment1 1.000 1.000 0.827 0.369817
## s(height):Treatment2 2.955 3.570 6.292 0.001029 **
## s(height):Treatment3 1.152 1.254 4.982 0.016408 *
## te(SysPres,DiaPres):Treatment0 6.063 6.216 3.759 0.005170 **
## te(SysPres,DiaPres):Treatment1 3.000 3.000 3.752 0.020101 *
## te(SysPres,DiaPres):Treatment2 5.274 6.248 6.921 6.54e-05 ***
## te(SysPres,DiaPres):Treatment3 5.092 5.721 10.266 7.84e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.832 Deviance explained = 93.9%
## GCV = 12.632 Scale est. = 4.5502 n = 91
anova(gam4.13, gam4.13.3, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## s(weight, by = Treatment) + te(SysPres, DiaPres, by = Treatment,
## k = 4)
## Model 2: FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## weight * Treatment + te(SysPres, DiaPres, by = Treatment,
## k = 4)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 8.221 12.632
## 2 28.236 149.146 -20.015 -136.51 7.3073 0.003172 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
With weight, the worsening effect is much smaller. Let us compare the residuals of gam4.13 and gam4.13.3 to get a better picture of the situation.
gam.check(gam4.13)
##
## Method: GCV Optimizer: magic
## Smoothing parameter selection converged after 258 iterations.
## The RMS GCV score gradient at convergence was 0.0008328835 .
## The Hessian was not positive definite.
## Model rank = 169 / 169
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(FGm0):Treatment0 9.00 6.40 1.26 1.00
## s(FGm0):Treatment1 9.00 8.74 1.26 0.97
## s(FGm0):Treatment2 9.00 1.00 1.26 0.99
## s(FGm0):Treatment3 9.00 6.46 1.26 0.99
## s(height):Treatment0 9.00 7.06 0.98 0.39
## s(height):Treatment1 9.00 2.32 0.98 0.35
## s(height):Treatment2 9.00 3.36 0.98 0.33
## s(height):Treatment3 9.00 1.00 0.98 0.34
## s(weight):Treatment0 9.00 1.00 1.12 0.86
## s(weight):Treatment1 9.00 1.86 1.12 0.81
## s(weight):Treatment2 9.00 7.93 1.12 0.85
## s(weight):Treatment3 9.00 4.50 1.12 0.85
## te(SysPres,DiaPres):Treatment0 15.00 6.18 1.09 0.78
## te(SysPres,DiaPres):Treatment1 15.00 4.44 1.16 0.94
## te(SysPres,DiaPres):Treatment2 15.00 6.41 1.17 0.94
## te(SysPres,DiaPres):Treatment3 15.00 7.81 1.18 0.94
gam.check(gam4.13.3)
##
## Method: GCV Optimizer: magic
## Smoothing parameter selection converged after 148 iterations by steepest
## descent step failure.
## The RMS GCV score gradient at convergence was 7.198955e-07 .
## The Hessian was positive definite.
## Model rank = 140 / 140
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(FGm0):Treatment0 9.00 5.22 1.26 0.99
## s(FGm0):Treatment1 9.00 7.38 1.26 1.00
## s(FGm0):Treatment2 9.00 1.02 1.26 1.00
## s(FGm0):Treatment3 9.00 5.16 1.26 0.98
## s(height):Treatment0 9.00 6.90 0.93 0.20
## s(height):Treatment1 9.00 1.00 0.93 0.24
## s(height):Treatment2 9.00 2.96 0.93 0.22
## s(height):Treatment3 9.00 1.15 0.93 0.17
## te(SysPres,DiaPres):Treatment0 15.00 6.06 0.96 0.28
## te(SysPres,DiaPres):Treatment1 15.00 3.00 1.02 0.56
## te(SysPres,DiaPres):Treatment2 15.00 5.27 1.09 0.81
## te(SysPres,DiaPres):Treatment3 15.00 5.09 1.08 0.76
The residuals of gam4.13.3 are more normally distributed and homoscedastic than those of gam4.13. Hence, each model has different pros and cons: gam4.13 explains almost all of the variance in the data, but its residuals are not normally distributed nor hoomoscedastic, while gam4.13.3 explains a lot of the variance in the data (but not almost all of it) and its residuals follow the modelsā assumptions better. In other words, gam4.13.3 shows a tradeoff between variance explanation and residualsā quality.
Moreover, gam4.13 might overfit the data, while gam4.13.3 might generalize better, since it is more parsimonious.
We can also compare the fitted linear model with gam4.13.3.
summary(gam4)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ Treatment + FGm0
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.6111 3.0551 0.200 0.841925
## Treatment1 -4.5812 1.4391 -3.183 0.002027 **
## Treatment2 -4.4290 1.4430 -3.069 0.002870 **
## Treatment3 -3.5497 1.3820 -2.568 0.011942 *
## FGm0 0.6217 0.1624 3.829 0.000244 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## R-sq.(adj) = 0.179 Deviance explained = 21.6%
## GCV = 23.5 Scale est. = 22.208 n = 91
summary(gam4.13.3)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## weight * Treatment + te(SysPres, DiaPres, by = Treatment,
## k = 4)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 25.52851 33.81411 0.755 0.4557
## weight 0.17292 0.09130 1.894 0.0671 .
## Treatment1 -34.96403 34.31610 -1.019 0.3157
## Treatment2 -37.41467 33.98955 -1.101 0.2790
## Treatment3 -13.58996 34.02578 -0.399 0.6922
## weight:Treatment1 0.07347 0.12873 0.571 0.5720
## weight:Treatment2 0.10460 0.10378 1.008 0.3209
## weight:Treatment3 -0.22138 0.10450 -2.118 0.0418 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0):Treatment0 5.223 5.412 5.468 0.000824 ***
## s(FGm0):Treatment1 7.384 8.078 6.330 5.38e-05 ***
## s(FGm0):Treatment2 1.018 1.031 4.188 0.045268 *
## s(FGm0):Treatment3 5.156 6.094 10.851 1.43e-06 ***
## s(height):Treatment0 6.904 7.139 5.438 0.000196 ***
## s(height):Treatment1 1.000 1.000 0.827 0.369817
## s(height):Treatment2 2.955 3.570 6.292 0.001029 **
## s(height):Treatment3 1.152 1.254 4.982 0.016408 *
## te(SysPres,DiaPres):Treatment0 6.063 6.216 3.759 0.005170 **
## te(SysPres,DiaPres):Treatment1 3.000 3.000 3.752 0.020101 *
## te(SysPres,DiaPres):Treatment2 5.274 6.248 6.921 6.54e-05 ***
## te(SysPres,DiaPres):Treatment3 5.092 5.721 10.266 7.84e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.832 Deviance explained = 93.9%
## GCV = 12.632 Scale est. = 4.5502 n = 91
anova(gam4, gam4.13.3, test = "F")
## Analysis of Deviance Table
##
## Model 1: FGm12 ~ Treatment + FGm0
## Model 2: FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## weight * Treatment + te(SysPres, DiaPres, by = Treatment,
## k = 4)
## Resid. Df Resid. Dev Df Deviance F Pr(>F)
## 1 86.000 1909.93
## 2 28.236 149.15 57.764 1760.8 6.6992 3.049e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The improvement is clear, since the linear model was very limited. Hence, we finally decide gam4.13.3 as our best model:
FGm12 ~ s(FGm0, by=Treatment) + s(height, by=Treatment) + weight*Treatment + te(SysPres, DiaPres, by=Treatment, k=4)
Let us interpret and visualize gam4.13.3.
summary(gam4.13.3)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## FGm12 ~ s(FGm0, by = Treatment) + s(height, by = Treatment) +
## weight * Treatment + te(SysPres, DiaPres, by = Treatment,
## k = 4)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 25.52851 33.81411 0.755 0.4557
## weight 0.17292 0.09130 1.894 0.0671 .
## Treatment1 -34.96403 34.31610 -1.019 0.3157
## Treatment2 -37.41467 33.98955 -1.101 0.2790
## Treatment3 -13.58996 34.02578 -0.399 0.6922
## weight:Treatment1 0.07347 0.12873 0.571 0.5720
## weight:Treatment2 0.10460 0.10378 1.008 0.3209
## weight:Treatment3 -0.22138 0.10450 -2.118 0.0418 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(FGm0):Treatment0 5.223 5.412 5.468 0.000824 ***
## s(FGm0):Treatment1 7.384 8.078 6.330 5.38e-05 ***
## s(FGm0):Treatment2 1.018 1.031 4.188 0.045268 *
## s(FGm0):Treatment3 5.156 6.094 10.851 1.43e-06 ***
## s(height):Treatment0 6.904 7.139 5.438 0.000196 ***
## s(height):Treatment1 1.000 1.000 0.827 0.369817
## s(height):Treatment2 2.955 3.570 6.292 0.001029 **
## s(height):Treatment3 1.152 1.254 4.982 0.016408 *
## te(SysPres,DiaPres):Treatment0 6.063 6.216 3.759 0.005170 **
## te(SysPres,DiaPres):Treatment1 3.000 3.000 3.752 0.020101 *
## te(SysPres,DiaPres):Treatment2 5.274 6.248 6.921 6.54e-05 ***
## te(SysPres,DiaPres):Treatment3 5.092 5.721 10.266 7.84e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.832 Deviance explained = 93.9%
## GCV = 12.632 Scale est. = 4.5502 n = 91
Overall, we have fit a semiparametric GAM to the hirsutism dataset in order to predict the hirsutism level observed on patients after following a certain treatment for a year.
There are 4 different treatments and, according to the model, each affects the final hirsutism level differently based on the baseline height, weight, systolic blood pressure, diastolic blood pressure and hirsutism level. Baseline measures were taken at the start of the clinical trial.
The effects of FGm0 and height are estimated independently with non-parametric estimators. On the other hand, the term corresponding to the weight predictor has been estimated linearly to improve the parsimony and the residualsā quality of the model. Finally, SysPres and DiaPres form a tensor spline with \(4^2\) knots.
As mentioned before, all terms interact with Treatment, which shows that, depending on the characteristics of the patient, one treatment or another should be prescribed. Nonetheless, treatments 1 and 2 reduce the hirsutism level the most on average. Treatment 3 could be prescribed to heavy people, however.
On average, treatment 0 (only contraceptive) does not show a significant reduction of the hirsutism level after a year of treatment, causing its expected value to be about 26. Treatments 1 and 2 (with antiandrogen), however, eradicate hirsutism level after a year, since their expected value is negative (and hence null). Finally, treatment 3 (also with antiandrogen) is expected to decrease the hirsutism level to about 12 in the same period of time.
Next, we will plot the non-parametric regression terms of the semiparametric GAM to identify what treatment is most suitable to different kinds of patients, as well as to find out about how the baseline characteristics of people affect their final hirsutism level on average.
par(mfrow = c(1, 1))
vis.gam(gam4.13.3, view = c("height", "weight"), plot.type = "persp", theta = 30, phi = 30)
vis.gam(gam4.13.3, view = c("height", "weight"), plot.type = "contour", main = "FGm12")
The height term seems almost constant on average. In addition, tall heavy people will have the smallest average hirsutism levels at the end of their treatment.
In the summary of the model, we can see that the weight coefficient is lowest for treatment 3. Hence, heavy people should undertake treatment 3.
plot(gam4.13.3, residuals = TRUE, shade = TRUE, seWithMean = TRUE, pages = 4)
s(FGm0):Treatment0 and s(height):Treatment0 have high standard errors at the boundaries and it is difficult to observe their fitted non-parametric regressions. This is probably caused by the lack of short or tall patients with high FGm0 (not necessarily both) that undertook treatment 0.
Let us look at the the smooth terms of treatments 1, 2 and 3.
# Plot smooth terms one by one, skipping Treatment0
par(mfrow = c(2, 3))
for (i in c(2, 3, 4, 6, 7, 8)) {
plot(gam4.13.3,
residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = i,
ylim = c(-20, 20)
)
}
par(mfrow = c(1, 1))
For treatment 1, it is best to have a baseline hirsutism level around 20 and, less significantly, to be short.
For treatment 2, the lower the baseline hirsutism level and, more significantly, the taller the patient is, the better.
For treatment 3, we recommend to prescribe it to tall patients with a low baseline hirsutism level.
Now, let us look at the same terms for treatment 0.
par(mfrow = c(1, 2))
# FGm0
plot(gam4.13.3,
residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = 1,
ylim = c(-100, 200)
)
# height
plot(gam4.13.3,
residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = 5,
ylim = c(-500, 200)
)
par(mfrow = c(1, 1))
Let us zoom at the y-range with the biggest amount of patients.
par(mfrow = c(1, 2))
# FGm0
plot(gam4.13.3,
residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = 1,
ylim = c(-100, 100)
)
# height
plot(gam4.13.3,
residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = 5,
ylim = c(-100, 100)
)
par(mfrow = c(1, 1))
Even though more data would be needed, we can approximately state that treatment 0 could be appropriate for tall people with a low baseline hirsutism level. However, treatment 2, which is more effective on average than treatment 0, would also be suitable for those patients. Consequently, other characteristics of the patient should be considered to opt for treatment 0.
par(mfrow = c(1, 1))
vis.gam(gam4.13.3, view = c("SysPres", "DiaPres"), plot.type = "persp", theta = 30, phi = 30)
vis.gam(gam4.13.3, view = c("SysPres", "DiaPres"), plot.type = "contour", main = "FGm12")
The average hirsutism level after 1 year of treatment is lowest for patients with high diastolic blood pressure and low systolic pressure simultaneously or low diastolic blood pressure and high systolic blood pressure simultaneously as well. We will now check what treatment is most suitable according to the baseline systolic and diastolic blood pressures of patients.
# Treatment 0
plot(gam4.13.3, residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = 9)
# Treatment 1
plot(gam4.13.3, residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = 10)
# Treatment 2
plot(gam4.13.3, residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = 11)
# Treatment 3
plot(gam4.13.3, residuals = TRUE, shade = TRUE, seWithMean = TRUE, select = 12)
To conclude, the semiparametric GAM we have fitted has allowed us not only to predict the hirsutism level after the treatment of a patient, but also to recommend him/her/them the most appropriate treatment according to his/her/their hirsutism level, height, weight and blood pressure. Moreover, our model explains 93.9% of the deviance, has an adjusted \(R^2\) of 0.832 and its residuals are normally distributed homoscedastic, making it correct, precise and able to generalize to new observations. Nonetheless, more short or tall patients with high FGm0 (not necessarily both) that undertook treatment 0 would have allowed us to better understand the base treatment without antiandrogen.